(57) ABSTRACT

The invention provides improved techniques for multiplication of signals represented in a normal basis of a finite field. An illustrative embodiment includes a first rotator which receives a first input signal representative of a first normal basis field element $(a_0\ a_1 \ldots a_{m-1})$, and a second rotator which receives a second input signal representative of a second normal basis field element $(b_0\ b_1 \ldots b_{m-1})$. A word multiplier receives output signals from the first and second rotators, corresponding to rotated representations of the first and second elements, respectively, and processes the rotated representations w bits at a time to generate an output signal representative of a product of the first and second elements, where w is a word length associated with the word multiplier. The rotated representation of the first element may be given by $A[i]=(a_i\ a_{i+1} \ldots a_{i+w-1})$, the rotated representation of the second element may be given by $B[i]=(b_i\ b_{i+1} \ldots b_{i+w-1})$, and the product may be given by $c=(C[0], C[w], C[2w], \ldots, C[m-w])$, where $C[i]=(c_i\ c_{i+1} \ldots c_{i+w-1})$, m is the degree of the finite field, w is the word length, and $i=0, 1, \ldots, m-1$. The invention is particularly well suited for implementation in software, and can provide performance advantages for both general normal basis and optimal normal basis.

16 Claims, 4 Drawing Sheets 
EFFICIENT FINITE FIELD MULTIPLICATION IN NORMAL BASIS

RELATED APPLICATION

The present application claims the priority of U.S. Provisional Application Ser. No. 60/070,193 filed in the name of inventors Yiqun Lisa Yin and Peng Ning on Dec. 30, 1997 and entitled "Efficient Software Implementation for Finite Field Multiplication in Normal Basis."

FIELD OF THE INVENTION

The present invention relates generally to information processing systems and devices, such as cryptographic systems and devices, which include a capability for multiplying signals of a finite field having a normal basis.

BACKGROUND OF THE INVENTION

Finite field arithmetic operations are becoming increasingly important in today's computer systems, particularly for cryptographic processing applications. Among the more common finite fields used in cryptography are odd-characteristic finite fields of degree 1, conventionally known as GF(p) arithmetic or arithmetic modulo a prime, and even-characteristic finite fields of degree greater than 1, conventionally known as $GF(2^m)$ arithmetic (where m is the degree). $GF(2^m)$ arithmetic is further classified according to the choice of basis for representing elements of the finite field; two common choices are polynomial basis and normal basis.

It is known that multiplication in normal basis, particularly in optimal normal basis (ONB), can be implemented efficiently in hardware. However, little attention has been devoted to implementing normal basis multiplication efficiently in software. A number of difficulties have prevented the development of fast software implementation of normal basis multiplication. First, when multiplying two elements represented in normal basis according to the standard formula, the coefficients of their product need to be computed one bit at a time. Second, the computation of a given coefficient involves a series of arithmetic operations which need to be performed sequentially in software, while in hardware, they can be easily parallelized.

We will first define some basic notation for a finite field $GF(2^m)$ and its representation in normal basis. Then, we will describe conventional multiplication formulas for both general normal basis and ONB. Let w denote the word size in bits. For a typical software implementation, we have w=32. Let m be a positive integer. For simplicity, we assume that w|m. The finite field $GF(2^m)$ consists of $2^m$ elements, with certain rules for field addition and multiplication. The finite field $GF(2^m)$ has various basis representations including normal basis representation. A binary polynomial is a polynomial with coefficients in GF(2). A binary polynomial is irreducible if it is not the product of two binary polynomials of smaller degree. For simplicity, we will refer to such a polynomial as an irreducible polynomial. Irreducible polynomials exist for every degree m and can be found efficiently. Let g(x) be an irreducible polynomial of degree m. If $\beta$ is a root of g(x), then the m distinct roots of g(x) in $GF(2^m)$ are given by

$$B = \{\beta, \beta^{2}, \beta^{2^2}, \ldots, \beta^{2^{m-1}}\}.$$

If the elements of B are linearly independent, then g(x) is called a normal polynomial and B is called a normal basis for $GF(2^m)$ over GF(2). Normal polynomials exist for every degree m. Given any element $a \in GF(2^m)$, one can write

$$a = \sum_{i=0}^{m-1} a_i \beta^{2^i},$$

where each $a_i \in \{0, 1\}$.

In normal basis, field multiplication is generally carried out using a multiplication matrix, which is an m-by-m matrix M with entries in GF(2). Details on how to compute matrix M from g(x) are known in the prior art, e.g., A. Menezes et al., "Applications of Finite Fields," Kluwer Academic Publishers, 1993, and IEEE Standard for Public-Key Cryptography, http://stdsbbs.ieee.org/groups/1363/index.html. Other details regarding conventional finite field arithmetic techniques can be found in, e.g., U.S. Pat. No. 4,587,627 issued May 6, 1986 to J. L. Massey and J. K. Omura, entitled "Computational method and apparatus for finite field arithmetic," and G.B. Patent No. 2,176,325 issued Dec. 17, 1986 to R. C. Mullin, I. M. Onyszchuk, and S. A. Vanstone, entitled "Finite field multiplication in a cryptographic system - offsetting suffixes and rotating binary digits in respective shift registers so as to produce all product vector terms simultaneously" (related U.S. Pat. No. 4,745,568 was issued May 17, 1988).

Below, we describe a conventional normal basis multiplication formula in two slightly different formats. Let $a=(a_0\ a_1 \ldots a_{m-1})$ and $b=(b_0\ b_1 \ldots b_{m-1})$ be two elements. Then their product $c=(c_0\ c_1 \ldots c_{m-1})$ can be computed one bit at a time as follows:

$$\begin{aligned}
c_0 &= (a_0\ a_1 \ldots a_{m-1})\, M\, (b_0\ b_1 \ldots b_{m-1})^T \\
c_1 &= (a_1\ a_2 \ldots a_{m-1}\ a_0)\, M\, (b_1\ b_2 \ldots b_{m-1}\ b_0)^T \\
&\ \ \vdots \\
c_{m-1} &= (a_{m-1}\ a_0 \ldots a_{m-2})\, M\, (b_{m-1}\ b_0 \ldots b_{m-2})^T
\end{aligned} \qquad (1)$$

In formula (1), when a new coefficient needs to be computed, the coefficients of both a and b are rotated to the left by one bit. This allows efficient hardware implementations of normal basis multiplication.

In a typical "C" programming language implementation of formula (1), a, b, and columns of M are all stored in words. Each matrix-vector multiplication $M (b_0\ b_1 \ldots b_{m-1})^T$ can be carried out with (m/2)(m/w) exclusive-or operations on average, and hence the total number of word operations for computing c is about $m(m/2)(m/w) = m^3/2w$. Note that the computation time is independent of the number of non-zero entries in M.

Let $M_{ij}$ denote the entries of matrix M. The following is another way of writing formula (1):

for k from 0 to m-1

$$c_k = \sum_{i=0}^{m-1} a_{i+k} \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot b_{j+k} \Big) \qquad (2)$$

Throughout the description, the addition operation in a subscript is to be understood as addition modulo the degree m unless otherwise specified; the symbol "$\cdot$" denotes AND; symbols "$\sum$" and "$\oplus$" denote exclusive-or. In formula (2), essentially the same expression is used for each coefficient $c_k$. More specifically, given the expression for $c_k$, simply increase the subscripts of a and b by one (modulo m), and the result is the expression for $c_{k+1}$.

Using formula (2), the fewer 1's in the multiplication matrix M, the faster a field multiplication can be done. An ONB is a normal basis which has the smallest number of 1's in the multiplication matrix M. There are two kinds of ONB, called type I ONB and type II ONB, that differ in the mathematical formulae which define them. For both types of
ONB the multiplication matrix has exactly 2m-1 non-zero entries. In particular, the first row has a single non-zero entry and the rest of the rows have exactly two non-zero entries. In terms of formula (2), the total number of terms (of the form $a_{i+k} \cdot M_{ij} \cdot b_{j+k}$) for each $c_k$ is 2m-1. ONBs only exist for certain values of degree m. For example, in the range [150, 200] there are only 15 values of m for which an ONB exists.

It is an object of the present invention to provide improved techniques for multiplication in a normal basis, which are particularly well suited for implementation in software, and can be applied to both general normal basis and ONB.

SUMMARY OF THE INVENTION

The invention provides improved techniques for implementing normal basis multiplication in processing systems and devices. The techniques are particularly well suited for implementation in software. Using the invention, the coefficients of a product of field elements can be computed one processor word at a time, e.g., 32 bits in a 32-bit processor, as opposed to one bit at a time as required by certain conventional approaches, thereby fully taking advantage of the fast word-based operations currently available in modern processors and software.

An illustrative embodiment of the invention includes a first rotator which receives a first input signal representative of a first normal basis field element $(a_0\ a_1 \ldots a_{m-1})$, and a second rotator which receives a second input signal representative of a second normal basis field element $(b_0\ b_1 \ldots b_{m-1})$. A word multiplier receives output signals from the first and second rotators, corresponding to rotated representations of the first and second elements, respectively, and processes the rotated representations w bits at a time to generate an output signal representative of a product of the first and second elements, where w is a word length associated with the word multiplier. The rotated representation of the first element may be given by $A[i]=(a_i\ a_{i+1} \ldots a_{i+w-1})$, the rotated representation of the second element may be given by $B[i]=(b_i\ b_{i+1} \ldots b_{i+w-1})$, and the product may be given by $c=(C[0], C[w], C[2w], \ldots, C[m-w])$, where $C[i]=(c_i\ c_{i+1} \ldots c_{i+w-1})$, m is the degree of the finite field, w is the word length, and $i=0, 1, \ldots, m-1$.

In accordance with another aspect of the invention, the performance of a normal basis multiplier can be further improved by precomputing and storing certain elements of the rotated representations, such as elements A[i+wt] and B[i+wt], where t=0, 1, ..., (m/w-1). For example, $A[i+m]=A[i]=(a_i\ a_{i+1} \ldots a_{i+w-1})$ and $B[i+m]=B[i]=(b_i\ b_{i+1} \ldots b_{i+w-1})$ may be precomputed, such that array A may be given by A[0], A[1], ..., A[m-1], A[m], A[m+1], ..., A[2m-1], and array B may be given by B[0], B[1], ..., B[m-1], B[m], B[m+1], ..., B[2m-1]. Each of array A and B thus includes 2m elements of length w. Words 0 through m-1 in A and B are then used in computing C[0], words w through w+m-1 in A and B are used in computing C[w], and the remaining elements C[2w], ..., C[m-w] of the product c are computed in the same manner. Further improvements can be provided in the case of an optimal normal basis (ONB) by, for example, precomputing two arrays, B1[i+m]=B1[i]=B[mult-array[2*i-1]] and B2[i+m]=B2[i]=B[mult-array[2*i]], for the rotated representation B, such that A, B1, and B2 can be accessed sequentially, where mult-array is an array with 2m-1 entries and is a compact representation of the multiplication matrix M.

The invention provides improved performance for both general normal basis and ONB. For example, for both type I ONB and type II ONB, the number of word operations involved for computing c is roughly $(m/w)(3m) = 3m^2/w$. Compared with conventional $m^3/2w$ operations using standard formula (1), the invention can improve computational speed by a factor on the order of m/6. Although the illustrative embodiment described above is particularly well suited for use in software configured to run on a conventional computer with a 32-bit or 64-bit processor, the invention can be implemented in computers or other systems or devices with other word lengths, including, e.g., embedded systems such as pagers, digital notepads or palmtop computers with 8-bit processors. As another example, although the illustrative embodiment involves multiplication of two field elements, the techniques of the invention can be extended in a straightforward manner to multiplication of more than two field elements. These and other features of the present invention will become more apparent from the accompanying drawings and the following detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an illustrative embodiment of an enhanced normal basis multiplication unit in accordance with the invention.

FIGS. 2A and 2B show enhanced normal basis multipliers for use in the multiplication unit of FIG. 1, for general normal basis and optimal normal basis (ONB), respectively.

FIG. 3 shows an illustrative embodiment of a rotator suitable for use in the normal basis multipliers of FIGS. 2A and 2B.

FIGS. 4A and 4B show illustrative embodiments of word multipliers for use in the enhanced normal basis multipliers of FIGS. 2A and 2B, respectively.

FIG. 5 shows an illustrative embodiment of an enhanced normal basis arithmetic unit in accordance with the invention, incorporating an enhanced normal basis multiplication unit such as that of FIG. 1.

FIG. 6 shows an illustrative embodiment of a processing system/device incorporating the enhanced normal basis arithmetic unit of FIG. 5.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides new approaches for implementing multiplication in normal basis. The techniques of the invention are particularly well suited for implementation in software. The invention will initially be described using a basic approach for general normal basis. Then, an optimization with pre-processing will be described. Finally, application of the basic approach to ONB will be described.

Suppose we want to compute c=ab in normal basis. For i=0, ..., m-1, we define

$$A[i] = (a_i\ a_{i+1} \ldots a_{i+w-1})$$
$$B[i] = (b_i\ b_{i+1} \ldots b_{i+w-1})$$
$$C[i] = (c_i\ c_{i+1} \ldots c_{i+w-1})$$

In other words, each A[i] has length w and corresponds to the successive blocks of a in a wrap-around fashion, and similarly for B[i] and C[i]. Note that c=(C[0], C[w], C[2w], ..., C[m-w]). Hence, in order to compute c, we only need to compute C[0], C[w], C[2w], ..., C[m-w]. Given the above definitions, we can rewrite formula (2) as follows:

for t from 0 to (m/w-1)

$$C[wt] = \sum_{i=0}^{m-1} A[(i+wt) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[(j+wt) \bmod m] \Big) \qquad (3)$$

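As an example illustrating the operation of formula (3), assume m=160 and w=32, which results in the following five equations:

$$C[0] = \sum_{i=0}^{m-1} A[i] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[j] \Big),$$

$$C[32] = \sum_{i=0}^{m-1} A[(i+32) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[(j+32) \bmod m] \Big),$$

$$C[64] = \sum_{i=0}^{m-1} A[(i+64) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[(j+64) \bmod m] \Big),$$

$$C[96] = \sum_{i=0}^{m-1} A[(i+96) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[(j+96) \bmod m] \Big),$$

$$C[128] = \sum_{i=0}^{m-1} A[(i+128) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[(j+128) \bmod m] \Big).$$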

It can be seen that the number of equations in formula (3) is 
only m/w, while the number of equations in formula (2) is 
m. In particular, one equation in formula (3) corresponds to 
w consecutive equations in formula (2). 
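
To see why one equation of formula (3) covers w consecutive equations of formula (2), it may help to follow a single bit position through the word operations. The following is a sketch under an assumed labeling convention that the text does not fix: bit position r of a word X[i] (for r = 0, ..., w-1) holds the coefficient with subscript i+r, and AND and exclusive-or act on each bit position independently. The notation $(X)_r$ for "bit r of word X" is introduced here for illustration only.

```latex
\begin{aligned}
\big(C[wt]\big)_r
  &= \sum_{i=0}^{m-1} \big(A[(i+wt) \bmod m]\big)_r \cdot
     \Big( \sum_{j=0}^{m-1} M_{ij} \cdot \big(B[(j+wt) \bmod m]\big)_r \Big) \\
  &= \sum_{i=0}^{m-1} a_{i+wt+r} \cdot
     \Big( \sum_{j=0}^{m-1} M_{ij} \cdot b_{j+wt+r} \Big) \\
  &= c_{wt+r}, \qquad r = 0, 1, \ldots, w-1 .
\end{aligned}
```

The last step is formula (2) with k = wt+r, with all subscripts taken modulo m. Each word equation therefore evaluates the w bit equations for k = wt, wt+1, ..., wt+w-1 at once.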

In a C implementation of formula (3) in accordance with the invention, "$\cdot$" may be carried out as an AND operation between two words, and "$\sum$" may be carried out as an exclusive-or operation between two words. Hence, using formula (3), we can compute w bits of c at the same time.
For general normal basis, the total number of word opera- 
tions for computing c is roughly (m/w) times the number of 
non-zero entries in M. Unlike formula (1), the number of 
operations in formula (3) depends on the number of non- 
zero entries in M. In particular, the fewer the number of 
non-zero entries, the faster the running time will be. Hence, 
formula (3) will provide good performance when M is 
sparse, especially when the basis is ONB. 

The following gives an illustrative implementation of formula (3) in C.

for (t = 0; t < m/w; t++)
{
    C[w*t] = 0;
    for (i = 0; i < m; i++)
    {
        temp = 0;
        /* exclusive-or of the B words selected by row i of M */
        for (j = 0; j < m; j++)
        {
            if (M[i][j] == 1)
                temp ^= B[(j + w*t) % m];
        }
        C[w*t] ^= A[(i + w*t) % m] & temp;
    }
}

We can also interchange the summations for t and i in formula (3) and rewrite the formula as follows:

for t from 0 to (m/w-1)
    C[wt] = 0

for i from 0 to m-1
    for t from 0 to (m/w-1)

$$C[wt] = C[wt] \oplus \Big( A[(i+wt) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M_{ij} \cdot B[(j+wt) \bmod m] \Big) \Big) \qquad (4)$$

The number of operations for formula (4) is similar to that of formula (3). Depending on the particular implementation and processor, one might yield better performance than the other.

For both formulas (3) and (4), we note that the computation of each word (A[i], B[i], for i=0, ..., m-1) involves one left shift, one right shift, and one exclusive-or. Once precomputation for A and B is done, rotations of the bits of A and B are no longer needed during the entire computation of c, thus significantly improving the performance.

We can further speed up the above-described basic approach by precomputing and storing A[i] and B[i] in the following manner. We first extend the definition of array A and B as follows: For i=0, ..., m-1, we define

$$A[i+m] = A[i] = (a_i\ a_{i+1} \ldots a_{i+w-1})$$
$$B[i+m] = B[i] = (b_i\ b_{i+1} \ldots b_{i+w-1})$$

(It should be noted that in this case, the addition in A[i+m] and B[i+m] is a real addition without modulo m.) We precompute array A and B, each of which consists of 2m elements of length w:

A[0], A[1], ..., A[m-1], A[m], A[m+1], ..., A[2m-1]
B[0], B[1], ..., B[m-1], B[m], B[m+1], ..., B[2m-1]

Given A and B, we can improve the C code given previously:

for (k = 0; k < m; k += w)
{
    C[k] = 0;
    for (i = 0; i < m; i++)
    {
        temp = 0;
        for (j = 0; j < m; j++)
        {
            if (M[i][j] == 1)
                temp ^= B[j];
        }
        C[k] ^= A[i] & temp;
    }
    /* pointer jump: the next output word starts w words further on */
    A += w;
    B += w;
}

The above code operates as follows: When computing C[0], we use word 0 through m-1 in array A and B (that is, the first m words). When computing C[w], we use word w through w+m-1 in array A and B, which is accomplished by pointer jumping. Similarly, we can compute C[2w], ..., C[m-w].

Application of the basic approach to ONB will now be described. For ONB, since most of the entries in the multiplication matrix M are zero, we can store M in a more compact way using an array called mult-array (written mult_array in the code below, since a hyphen is not valid in a C identifier) defined as follows:

k = 0;
for (i = 0; i < m; i++) {
    for (j = 0; j < m; j++) {
        if (M[i][j] == 1) {
            /* record the column index of each non-zero entry */
            mult_array[k] = j;
            k++;
        }
    }
}

The C code given previously can be further simplified using the fact that the inner loop j no longer exists, since it only involves one or two elements of B.

for (k = 0; k < m; k += w)
{
    /* row 0 of M has a single non-zero entry */
    temp = A[0] & B[mult_array[0]];
    for (i = 1; i < m; i++)
    {
        /* rows 1..m-1 have two non-zero entries; addition in GF(2) is exclusive-or */
        temp ^= A[i] & (B[mult_array[2*i-1]] ^ B[mult_array[2*i]]);
    }
    C[k] = temp;
    A += w;
    B += w;
}
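
The ONB loop above can be exercised end to end with a small self-checking sketch. It is not the patent's implementation: the sizes `M_DEG` and `W`, all function names, and the bit order inside a word (bit r of word i holds coefficient i+r) are assumptions made for illustration, and the output words are stored compactly as `C[k / W]` rather than `C[k]`. The matrix is synthetic, with only the ONB sparsity pattern (one 1 in row 0, two 1's in every other row), and is not derived from a real field; the check compares the word-wise result with a bit-at-a-time evaluation of formula (2) for that same matrix.

```c
#include <stdint.h>

enum { M_DEG = 64, W = 32 };  /* assumed sizes for this sketch; W divides M_DEG */

/* Rotated, doubled representation: bit r of out[i] is x[(i + r) mod M_DEG],
   and out[i + M_DEG] == out[i]. */
void rotate_expand(const uint8_t x[M_DEG], uint32_t out[2 * M_DEG])
{
    for (int i = 0; i < M_DEG; i++) {
        uint32_t word = 0;
        for (int r = 0; r < W; r++)
            word |= (uint32_t)(x[(i + r) % M_DEG] & 1u) << r;
        out[i] = out[i + M_DEG] = word;
    }
}

/* Synthetic matrix with the ONB sparsity pattern only; NOT from a real field. */
void make_sparse_matrix(uint8_t mat[M_DEG][M_DEG])
{
    for (int i = 0; i < M_DEG; i++)
        for (int j = 0; j < M_DEG; j++)
            mat[i][j] = 0;
    mat[0][1] = 1;
    for (int i = 1; i < M_DEG; i++) {
        mat[i][(3 * i) % M_DEG] = 1;
        mat[i][(3 * i + 1) % M_DEG] = 1;
    }
}

/* Compact index: column numbers of the 1's, scanned row by row. */
int build_mult_array(uint8_t mat[M_DEG][M_DEG], int mult_array[2 * M_DEG - 1])
{
    int k = 0;
    for (int i = 0; i < M_DEG; i++)
        for (int j = 0; j < M_DEG; j++)
            if (mat[i][j])
                mult_array[k++] = j;
    return k;
}

/* Word-wise ONB multiplication: one output word per outer iteration. */
void onb_multiply(const int *mult_array, const uint32_t *A,
                  const uint32_t *B, uint32_t C[M_DEG / W])
{
    for (int k = 0; k < M_DEG; k += W) {
        uint32_t temp = A[0] & B[mult_array[0]];
        for (int i = 1; i < M_DEG; i++)
            temp ^= A[i] & (B[mult_array[2 * i - 1]] ^ B[mult_array[2 * i]]);
        C[k / W] = temp;  /* holds c_k ... c_{k+W-1} */
        A += W;           /* pointer jump into the doubled arrays */
        B += W;
    }
}

/* Bit-at-a-time reference, formula (2). */
void bitwise_multiply(uint8_t mat[M_DEG][M_DEG], const uint8_t *a,
                      const uint8_t *b, uint8_t c[M_DEG])
{
    for (int k = 0; k < M_DEG; k++) {
        uint8_t bit = 0;
        for (int i = 0; i < M_DEG; i++) {
            uint8_t s = 0;
            for (int j = 0; j < M_DEG; j++)
                s ^= mat[i][j] & b[(j + k) % M_DEG];
            bit ^= a[(i + k) % M_DEG] & s;
        }
        c[k] = bit;
    }
}

/* Returns 1 if both methods agree on pseudo-random inputs, else 0. */
int onb_selfcheck(void)
{
    static uint8_t mat[M_DEG][M_DEG];
    static uint32_t A[2 * M_DEG], B[2 * M_DEG];
    uint8_t a[M_DEG], b[M_DEG], c[M_DEG];
    int mult_array[2 * M_DEG - 1];
    uint32_t C[M_DEG / W], state = 12345u;
    make_sparse_matrix(mat);
    if (build_mult_array(mat, mult_array) != 2 * M_DEG - 1)
        return 0;
    for (int i = 0; i < M_DEG; i++) {  /* simple LCG as a bit source */
        state = state * 1664525u + 1013904223u;
        a[i] = (uint8_t)((state >> 16) & 1u);
        state = state * 1664525u + 1013904223u;
        b[i] = (uint8_t)((state >> 16) & 1u);
    }
    rotate_expand(a, A);
    rotate_expand(b, B);
    onb_multiply(mult_array, A, B, C);
    bitwise_multiply(mat, a, b, c);
    for (int k = 0; k < M_DEG; k++)
        if (((C[k / W] >> (k % W)) & 1u) != c[k])
            return 0;
    return 1;
}
```

Calling `onb_selfcheck()` from a test harness is one way to confirm that the word-wise loop and formula (2) agree for the chosen sizes. The largest index touched in the doubled arrays is (M_DEG - W) + (M_DEG - 1), which stays below 2 * M_DEG, so no modulo reduction is needed inside the loop.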

For both type I ONB and type II ONB, the number of word operations involved for computing c is roughly $(m/w)(3m) = 3m^2/w$. Compared with conventional $m^3/2w$ operations using standard formula (1), the invention can provide a factor of approximately m/6 improvement in speed. Note that the improvement is independent of the word size w. An illustrative implementation of the invention with m=160 and w=32 showed a factor of 20 improvement. It should be noted that further improvements may be made to the code given above. For example, the code can be further improved by precomputing two separate arrays for the representation B:

B1[i+m] = B1[i] = B[mult-array[2*i-1]]
B2[i+m] = B2[i] = B[mult-array[2*i]]

such that the arrays A, B1, and B2 can be accessed sequentially.
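
The operation counts quoted above can be checked with a short worked calculation for the illustrative parameters m=160 and w=32. This is only arithmetic on the estimates stated in the text, not a measurement:

```latex
\begin{aligned}
\text{ONB, word-wise:}\quad & \frac{m}{w}\cdot 3m = \frac{3m^2}{w}
   = \frac{3\cdot 160^2}{32} = 2400 \\
\text{formula (1):}\quad & \frac{m^3}{2w} = \frac{160^3}{64} = 64000 \\
\text{ratio:}\quad & \frac{m^3/2w}{3m^2/w} = \frac{m}{6}
   = \frac{160}{6} \approx 26.7
\end{aligned}
```

The word size w cancels in the ratio, which is why the estimated improvement does not depend on w. The estimate of about 26.7 is of the same order as the factor of 20 reported for the illustrative implementation.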

FIG. 1 shows an enhanced normal basis multiplication unit 10 in accordance with an illustrative embodiment of the invention. The multiplication unit 10 includes an enhanced normal basis multiplier 12 and a multiplication index generator 14. The multiplication index generator 14 outputs a multiplication index, which may be, e.g., a matrix or an array that represents a matrix. The normal basis multiplier 12 takes as inputs two field elements a, b, and the multiplication index from generator 14, and outputs a field element c which is the product of elements a and b in normal basis. The multiplication index only needs to be computed once for a given normal basis of the finite field, and a given sequence of field multiplications (i.e., c=ab) can then be performed using the same multiplication index.

FIGS. 2A and 2B show more detailed block diagrams of the enhanced normal basis multiplier 12 of FIG. 1, for use with general normal basis and ONB, respectively. The enhanced normal basis multiplier 12 of FIG. 2A includes two rotators 22, 24 and a word multiplier 26. The first rotator 22 takes the first input field element a and outputs a value A, which is a rotated representation of a. Similarly, the second rotator 24 takes the second input field element b and outputs a value B, which is a rotated representation of b. The word multiplier 26 takes the rotated representations A, B, and the multiplication index from generator 14 as inputs and outputs the product c=ab. The normal basis multiplier 12 of FIG. 2A is used in the case of a general normal basis, and in this case, the multiplication index is an m-by-m matrix. The normal basis multiplier 12' of FIG. 2B is used in the case of an ONB, and also includes rotators 22, 24 and word multiplier 26 as in FIG. 2A. In the ONB case, the multiplication index is represented in a more compact way using an array called mult-array with only 2m-1 entries.
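
The compact multiplication index for the ONB case can be sketched as a small C routine. This is an illustration under stated assumptions rather than the generator 14 itself: the degree `M_DEG = 4`, the function names, the validation step, and the example matrix are all hypothetical, and the matrix is a synthetic one that merely has the ONB sparsity pattern (one non-zero entry in row 0, two in every other row).

```c
#include <stdint.h>

enum { M_DEG = 4 };  /* tiny degree, chosen only to keep the example readable */

/* Scans the matrix row by row and records the column index of every 1.
   Returns the number of entries written (2*M_DEG - 1 on success), or -1 if
   the matrix does not have the sparsity pattern described for ONB. */
int build_mult_array(uint8_t mat[M_DEG][M_DEG], int mult_array[2 * M_DEG - 1])
{
    int k = 0;
    for (int i = 0; i < M_DEG; i++) {
        int ones_in_row = 0;
        for (int j = 0; j < M_DEG; j++) {
            if (mat[i][j] == 1) {
                if (k >= 2 * M_DEG - 1)
                    return -1;          /* too many non-zero entries */
                mult_array[k++] = j;
                ones_in_row++;
            }
        }
        /* row 0 must hold one 1, every other row exactly two */
        if (ones_in_row != (i == 0 ? 1 : 2))
            return -1;
    }
    return k;
}

/* Builds the index for a synthetic example matrix and returns one entry,
   or -1 if the build fails. */
int demo_entry(int index)
{
    uint8_t mat[M_DEG][M_DEG] = {
        {0, 1, 0, 0},
        {1, 0, 0, 1},
        {0, 0, 1, 1},
        {0, 1, 1, 0},
    };
    int mult_array[2 * M_DEG - 1];
    if (build_mult_array(mat, mult_array) != 2 * M_DEG - 1)
        return -1;
    return mult_array[index];
}
```

For the example matrix the index comes out as 1, 0, 3, 2, 3, 1, 2: entry 0 belongs to row 0, and entries 2i-1 and 2i belong to row i, which is the layout the ONB multiplication loop relies on.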

FIG. 3 shows an illustrative embodiment of the rotator 22 of FIGS. 2A and 2B. The rotator 22 takes a field element a, as shown generally at 32, and rotates/expands it, in a rotate and expand operation 34, into an array of words 35, each of which has the same length (typically the length of a computer word). The rotator 22 in an operation 36 then makes two identical copies 38-1 and 38-2 of the array of words 35 to produce the rotated representation A. Alternative embodiments of the rotator are possible. For example, the copying operation 36 may be omitted in an alternative embodiment.
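
A rotator of this kind can be sketched in C as follows. The sketch rests on assumptions not fixed by the text: the sizes `M_DEG` and `W`, the function names, the storage of the input as one coefficient per byte, and the convention that bit r of word i holds coefficient i+r. Each word after the first is derived from its predecessor with one right shift, one left shift and one exclusive-or, and the finished array is then copied once so that word i+M_DEG equals word i.

```c
#include <stdint.h>

enum { M_DEG = 64, W = 32 };  /* assumed sizes for this sketch; W divides M_DEG */

/* Rotate-and-expand followed by duplication.
   Bit r of out[i] is x[(i + r) mod M_DEG]; out[i + M_DEG] == out[i]. */
void rotator(const uint8_t x[M_DEG], uint32_t out[2 * M_DEG])
{
    uint32_t word = 0;
    for (int r = 0; r < W; r++)          /* first word, built bit by bit */
        word |= (uint32_t)(x[r % M_DEG] & 1u) << r;
    out[0] = word;
    for (int i = 1; i < M_DEG; i++) {
        /* drop coefficient i-1, bring in coefficient i+W-1 at the top bit */
        uint32_t incoming = (uint32_t)(x[(i + W - 1) % M_DEG] & 1u);
        word = (word >> 1) ^ (incoming << (W - 1));
        out[i] = word;
    }
    for (int i = 0; i < M_DEG; i++)      /* second, identical copy */
        out[i + M_DEG] = out[i];
}

/* Direct evaluation of the definition, used only as a cross-check. */
uint32_t rotated_word(const uint8_t x[M_DEG], int i)
{
    uint32_t word = 0;
    for (int r = 0; r < W; r++)
        word |= (uint32_t)(x[(i + r) % M_DEG] & 1u) << r;
    return word;
}

/* Returns 1 if the incremental rotator matches the definition, else 0. */
int rotator_selfcheck(void)
{
    uint8_t x[M_DEG];
    uint32_t out[2 * M_DEG];
    uint32_t state = 2024u;
    for (int i = 0; i < M_DEG; i++) {    /* simple LCG as a bit source */
        state = state * 1664525u + 1013904223u;
        x[i] = (uint8_t)((state >> 16) & 1u);
    }
    rotator(x, out);
    for (int i = 0; i < 2 * M_DEG; i++)
        if (out[i] != rotated_word(x, i % M_DEG))
            return 0;
    return 1;
}
```

The duplicated copy is what lets a word multiplier step through the array with plain pointer increments instead of reducing every index modulo the degree.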
FIGS. 4A and 4B show illustrative embodiments of the word multiplier in the normal basis multipliers of FIGS. 2A and 2B, respectively. The word multiplier 26 of FIG. 4A is for use in the general normal basis case. The product c is computed one word at a time in a sequence of operations 42 which includes AND and XOR operations. In each step, the pointer to the rotated representation A and the pointer to the rotated representation B are first set to the desired location of A and B, respectively. Then, the rotated representations A and B, the two pointers to A and B, and the multiplication index from generator 14 are processed to produce one word of the product c. Pointer jumps are provided as shown in operation 42. The set of words comprising the product c are shown generally at 46. The word multiplier 26' of FIG. 4B is for use in the ONB case. In this case, the above-noted mult-array is used in the word operations 44.
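
For the general normal basis case, the pointer-jumping word multiplier can be sketched as below and compared with a bit-at-a-time evaluation of formula (2). The sizes `M_DEG` and `W`, the function names, the bit order inside a word, and the compact storage of the product as `C[k / W]` are assumptions for illustration. The matrix is filled with pseudo-random bits, so it is not the multiplication matrix of any real field; the agreement between the two methods does not depend on the matrix entries.

```c
#include <stdint.h>

enum { M_DEG = 64, W = 32 };  /* assumed sizes for this sketch; W divides M_DEG */

/* Rotated, doubled representation: bit r of out[i] is x[(i + r) mod M_DEG]. */
void rotate_expand(const uint8_t x[M_DEG], uint32_t out[2 * M_DEG])
{
    for (int i = 0; i < M_DEG; i++) {
        uint32_t word = 0;
        for (int r = 0; r < W; r++)
            word |= (uint32_t)(x[(i + r) % M_DEG] & 1u) << r;
        out[i] = out[i + M_DEG] = word;
    }
}

/* Word multiplier: produces one W-bit word of the product per step. */
void word_multiply(uint8_t mat[M_DEG][M_DEG], const uint32_t *A,
                   const uint32_t *B, uint32_t C[M_DEG / W])
{
    for (int k = 0; k < M_DEG; k += W) {
        uint32_t acc = 0;
        for (int i = 0; i < M_DEG; i++) {
            uint32_t temp = 0;
            for (int j = 0; j < M_DEG; j++)
                if (mat[i][j])
                    temp ^= B[j];       /* exclusive-or of selected B words */
            acc ^= A[i] & temp;
        }
        C[k / W] = acc;                 /* holds c_k ... c_{k+W-1} */
        A += W;                         /* pointer jumps into doubled arrays */
        B += W;
    }
}

/* Bit-at-a-time reference, formula (2). */
void bitwise_multiply(uint8_t mat[M_DEG][M_DEG], const uint8_t *a,
                      const uint8_t *b, uint8_t c[M_DEG])
{
    for (int k = 0; k < M_DEG; k++) {
        uint8_t bit = 0;
        for (int i = 0; i < M_DEG; i++) {
            uint8_t s = 0;
            for (int j = 0; j < M_DEG; j++)
                s ^= mat[i][j] & b[(j + k) % M_DEG];
            bit ^= a[(i + k) % M_DEG] & s;
        }
        c[k] = bit;
    }
}

/* Simple LCG used only to generate test bits. */
uint8_t lcg_bit(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return (uint8_t)((*state >> 16) & 1u);
}

/* Returns 1 if the word-wise and bit-wise results agree, else 0. */
int word_multiplier_selfcheck(void)
{
    static uint8_t mat[M_DEG][M_DEG];
    static uint32_t A[2 * M_DEG], B[2 * M_DEG];
    uint8_t a[M_DEG], b[M_DEG], c[M_DEG];
    uint32_t C[M_DEG / W], state = 777u;
    for (int i = 0; i < M_DEG; i++) {
        a[i] = lcg_bit(&state);
        b[i] = lcg_bit(&state);
        for (int j = 0; j < M_DEG; j++)
            mat[i][j] = lcg_bit(&state);
    }
    rotate_expand(a, A);
    rotate_expand(b, B);
    word_multiply(mat, A, B, C);
    bitwise_multiply(mat, a, b, c);
    for (int k = 0; k < M_DEG; k++)
        if (((C[k / W] >> (k % W)) & 1u) != c[k])
            return 0;
    return 1;
}
```

The inner work of `word_multiply` is proportional to the number of non-zero entries of the matrix, which matches the observation that a sparse multiplication index gives a faster multiplication.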

FIGS. 5 and 6 show exemplary applications of a normal basis multiplication unit in accordance with the invention. Many public-key cryptosystems are based on operations in finite fields. Two major classes of such cryptosystems are conventional discrete logarithm cryptosystems and elliptic curve cryptosystems. The present invention is very useful for providing performance improvements in these and other types of cryptosystems. FIG. 5 shows an enhanced normal basis arithmetic unit 50 which includes the enhanced normal basis multiplication unit 10 of FIG. 1, a normal basis squaring unit 52, and a normal basis inversion unit 54. The units 10, 52 and 54 are coupled to a memory in the form of a set of registers 58. It should be noted that the normal basis multiplication unit 10 and the elements thereof, and the other units of arithmetic unit 50 and their corresponding elements, may be configured as software modules executed by a processor, as separate dedicated hardware modules, or as various combinations of software and hardware. Many other configurations of elements utilizing the normal basis multiplication unit 10 will be apparent to those skilled in the art.

FIG. 6 shows a processing system or device 60 which includes the enhanced normal basis arithmetic unit 50 coupled to a cryptographic processor 62, in order to support cryptographic operations (e.g., ECDSA) in normal basis. The system or device 60 may also include other elements, e.g., a memory or other processing elements, arranged in a conventional manner. The system or device 60 may represent, for example, a user terminal in a cryptographic system, such as a personal desktop or portable computer, microcomputer, mainframe computer, workstation, telephone, personal communication device, pager, palmtop computer, digital notepad, television set top box or any other type of processing or communication terminal, as well as portions or combinations of such systems and devices. The processing system or device 60 may include or be comprised of a microprocessor, central processing unit (CPU), application-specific integrated circuit (ASIC) or any other suitable digital data processor. The term "processor" as used herein is intended to include these and other types of systems or devices.

The invention can be used in systems or devices which operate in conjunction with data transfer over a global computer network such as the Internet, a wide area network (WAN), a local area network (LAN), a satellite network, a telephone or cable network, or various combinations of these and other types of networks, using conventional data transfer techniques including but not limited to asynchronous transfer mode (ATM), synchronous optical network/synchronous digital hierarchy (SONET/SDH) and/or transmission control protocol/Internet protocol (TCP/IP). Additional details regarding cryptographic applications of the normal basis multiplication techniques of the invention may be found, for example, in U.S. application Ser. No. 08/851,045 filed on May 5, 1997 and entitled "Methods and Apparatus for Efficient Finite Field Basis Conversion."

The normal basis multiplication techniques, systems and devices described herein are exemplary and should not be construed as limiting the present invention to any particular embodiment or group of embodiments. For example, although the illustrative embodiments described above are well suited for use in software configured to run on a conventional computer with a 32-bit or 64-bit processor, the invention can be implemented in computers or other systems or devices with other word lengths, including, e.g., embedded systems such as pagers, digital notepads or palmtop computers with 8-bit processors. As another example, although the illustrative embodiments involve multiplication of two field elements, the techniques of the invention can be extended in a straightforward manner to multiplication of more than two field elements. These and numerous alternative embodiments within the scope of the appended claims will be readily apparent to those skilled in the art.

What is claimed is:

1. An apparatus for multiplying signals represented in a normal basis for a finite field, the apparatus comprising:

a first rotator receiving a first input signal representative of a first normal basis element;

at least one additional rotator, each receiving an input signal representative of a corresponding additional normal basis field element; and

a word multiplier operative to receive output signals from the first and additional rotators, corresponding to rotated digital representations of the first and additional elements, respectively, and to process the rotated representations w bits at a time to generate an output signal representative of a product of the first and additional elements, where w is a word length which is associated with the word multiplier and which is selected independently of the degree of the finite field.

2. The apparatus of claim 1 wherein the at least one 
additional rotator includes a second rotator receiving a 
second input signal representative of a second element, and 
the output signal generated by the word multiplier is repre- 
sentative of a product of the first and second elements. 

3. The apparatus of claim 2 wherein the rotated representation of the first element is given by $A[i]=(a_i\ a_{i+1} \ldots a_{i+w-1})$, the rotated representation of the second element is given by $B[i]=(b_i\ b_{i+1} \ldots b_{i+w-1})$, and the product is given by c=(C[0], C[w], C[2w], ..., C[m-w]), where $C[i]=(c_i\ c_{i+1} \ldots c_{i+w-1})$, m is the degree of the finite field, and i=0, 1, ..., m-1.

4. The apparatus of claim 3 wherein the product c=(C[0], C[w], C[2w], ..., C[m-w]) is computed by repeating the following computation for t from 0 to (m/w-1):

$$C[wt] = \sum_{i=0}^{m-1} A[(i+wt) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M[i,j] \cdot B[(j+wt) \bmod m] \Big),$$

where $\cdot$ represents an AND operation between two words, $\sum$ represents an exclusive-or operation between two words, and M[i,j] is the multiplication matrix of the normal basis.

5. The apparatus of claim 4 wherein A[i] and B[i] are precomputed and stored for i=0, 1, ..., m-1.

6. The apparatus of claim 3 wherein the product c=(C[0], C[w], C[2w], ..., C[m-w]) is computed as follows:

for t from 0 to (m/w-1)
    C[wt] = 0

for i from 0 to m-1
    for t from 0 to (m/w-1)

$$C[wt] = C[wt] \oplus \Big( A[(i+wt) \bmod m] \cdot \Big( \sum_{j=0}^{m-1} M[i,j] \cdot B[(j+wt) \bmod m] \Big) \Big)$$

where $\cdot$ represents an AND operation between two words, $\sum$ and $\oplus$ both represent exclusive-or operations between two words, and M[i,j] is the multiplication matrix of the normal basis.

7. The apparatus of claim 6 wherein A[i] and B[i] are 
precomputed and stored for i=0, 1, . . . m-1. 

8. The apparatus of claim 3 wherein $A[i+m]=A[i]=(a_i\ a_{i+1} \ldots a_{i+w-1})$, $B[i+m]=B[i]=(b_i\ b_{i+1} \ldots b_{i+w-1})$, array A is precomputed as A[0], A[1], ..., A[m-1], A[m], A[m+1], ..., A[2m-1], and array B is precomputed as B[0], B[1], ..., B[m-1], B[m], B[m+1], ..., B[2m-1], such that each of array A and B include 2m elements of length w.

9. The apparatus of claim 8 wherein words 0 through m-1 in A and B are used in computing C[0], words w through w+m-1 in A and B are used in computing C[w], and the remaining elements C[2w], ..., C[m-w] of the product c are computed in the same manner.

10. The apparatus of claim 1 wherein the word multiplier 
further comprises a multiplication index input and wherein 
the apparatus further comprises a multiplication index gen- 
erator having an output coupled to the multiplication index 
input of the word multiplier. 

11. The apparatus of claim 10 wherein the multiplication 
index generator generates a multiplication index which is an 
m-by-m multiplication matrix M, wherein m is the degree of 
the finite field. 

12. The apparatus of claim 10 wherein the normal basis is an optimal normal basis, and wherein the multiplication index generator generates a multiplication index which is an array with 2m-1 entries, corresponding to a compact representation of a multiplication matrix M, where m is the degree of the finite field.

13. A method for use in a processor for multiplying signals represented in a normal basis for a finite field, the method comprising:

receiving a first input signal representative of a first normal basis field element;

receiving at least one additional input signal representative of a corresponding additional normal basis field element;

generating rotated digital representations of the first and additional elements; and

processing the rotated representations w bits at a time to generate an output signal representative of a product of the first and additional elements, where w is a word length which is associated with the processor and which is selected independently of the degree of the finite field.

14. An article of manufacture comprising a machine-readable medium containing one or more programs for multiplying signals represented in a normal basis for a finite field, which when executed on a processor, implement the steps of:

receiving a first input signal representative of a first normal basis field element;

receiving at least one additional input signal representative of a corresponding additional normal basis field element;

generating rotated digital representations of the first and additional elements; and

processing the rotated representations w bits at a time to generate an output signal representative of a product of the first and additional elements, where w is a word length which is associated with the processor and which is selected independently of the degree of the finite field.

15. An apparatus for multiplying signals represented in a normal basis for a finite field, the apparatus comprising:

a normal basis multiplication unit comprising a first rotator receiving a first input signal representative of a first normal basis field element; at least one additional rotator, each receiving an input signal representative of a corresponding additional normal basis field element; and a word multiplier operative to receive output signals from the first and additional rotators, corresponding to rotated digital representations of the first and additional elements, respectively, and to process the rotated representations w bits at a time to generate an output signal representative of a product of the first and additional elements, where w is a word length which is associated with the word multiplier and which is selected independently of the degree of the finite field;

at least one of a normal basis inversion unit and a normal basis squaring unit; and a memory associated with at least the multiplication unit.

16. An apparatus for multiplying signals represented in a normal basis for a finite field, the apparatus comprising:

a normal basis multiplication unit comprising a first rotator receiving a first input signal representative of a first normal basis field element; at least one additional rotator, each receiving an input signal representative of a corresponding additional normal basis field element; and a word multiplier operative to receive output signals from the first and additional rotators, corresponding to rotated digital representations of the first and additional elements, respectively, and to process the rotated representations w bits at a time to generate an output signal representative of a product of the first and additional elements, where w is a word length which is associated with the word multiplier and which is selected independently of the degree of the finite field; and

a cryptographic processor for implementing one or more cryptographic operations utilizing the normal basis multiplication unit.